The default network dominates neural responses to evolving movie stories

Neuroscientific studies exploring real-world dynamic perception often overlook the influence of continuous changes in narrative content. In our research, we utilize machine learning tools for natural language processing to examine the relationship between movie narratives and neural responses. By analyzing over 50,000 brain images of participants watching Forrest Gump from the studyforrest dataset, we find distinct brain states that capture unique semantic aspects of the unfolding story. The default network, associated with semantic information integration, is the most engaged during movie watching. Furthermore, we identify two mechanisms that underlie how the default network liaises with the amygdala and hippocampus. Our findings demonstrate effective approaches to understanding neural processes in everyday situations and their relation to conscious awareness.

). Several annotations (e.g., "fear") correlated sparsely with multiple semantic contexts, while many annotations (e.g., "Europe") had low correlation values with most of the semantic contexts. Source data are provided as a Source Data file. Inside the plot, each x tick denotes one unique set of semantic contexts. The label "des" indicates that the text data are the human description of the movie rather than the subtitles ("sub"). The two text extraction methods used for LSA (cf. methods) were nonnegative matrix factorization (NMF) and singular value decomposition (SVD). We generated 200 semantic contexts via NMF-LSA and 5 semantic contexts via SVD-LSA. Thus, each box contains 12000 (200 semantic contexts × 15 subjects × 4 states) correlation values for NMF-LSA and 300 (5 semantic contexts × 15 subjects × 4 states) for SVD-LSA. Across the different sets of semantic contexts, the DN had the largest median semantic brain link strength. Boxplot: the upper (lower) edge of the box is the 75th (25th) percentile (interquartile range); the middle line is the median value; the green triangle shows the mean value; the whiskers summarize the extreme data points of the distribution of median semantics-brain associations. Short names for Schaefer-Yeo networks:

Supplementary Figure 10: The grid search of mean and max Pearson's correlation between semantic contexts and human-curated annotations.
We performed a grid search to find the optimal window length for aggregating semantic information in the movie text data (cf. methods). Considering both the mean and the maximum correlation with the annotations, a window length of 240 s was the best choice. Source data are provided as a Source Data file.

Supplementary Figure 11: Permutation test of the DN&AM and DN&HC semantic brain link strength.
We used a permutation test to assess the significance of the correlation between the DN BOLD time series and our derived semantic labels. Specifically, we randomly shuffled the time series 1000 times and calculated the correlation coefficient for each shuffle. The resulting null distribution is summarized in the blue histogram. The red dashed line represents the correlation coefficient for the true data points of the 15 subjects. A) Permutation test for DN&AM mode. B) Permutation test for DN&HC mode. Source data are provided as a Source Data file.

Supplementary Figure 12: Optimal number of states selected for seven different canonical networks using the same selection procedure as Supplementary Figure 1 (A).

To determine the optimal number of states for the seven canonical networks analyzed in Figure X, we used the same selection procedure as described in Supplementary Figure 1 (A). We selected the number of states that produced the highest average of the top 10 Pearson correlation links between the hidden Markov model (HMM) state presence and the extracted semantic contexts, which indicates the closest alignment of the brain activity model with movie narrative features. Source data are provided as a Source Data file.

Supplementary Figure 13: Comparison of average link strength between DN group and the whole group.
The x-axis represents the window length in seconds and the y-axis represents the correlation. We obtained new sets of semantic labels for different window lengths and calculated the semantic brain link strength for all of our brain states using these new variables. We then calculated the average link strength for the default network (blue line) and for all 7 networks (orange line), so the plot shows the difference in average link strength between the DN group and the entire group. Source data are provided as a Source Data file.
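
The state-number selection criterion described for Supplementary Figure 12 can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names, the array layout (time points in rows), and the use of signed rather than absolute correlations are assumptions, and the HMM fitting itself is not shown.

```python
import numpy as np

def pearson_links(state_presence, semantic_contexts):
    """Pearson r between every state column and every context column.

    state_presence:    array of shape (n_timepoints, n_states)
    semantic_contexts: array of shape (n_timepoints, n_contexts)
    Returns an (n_states, n_contexts) correlation matrix.
    Assumes no column is constant (otherwise its std is zero).
    """
    s = np.asarray(state_presence, dtype=float)
    c = np.asarray(semantic_contexts, dtype=float)
    s = (s - s.mean(axis=0)) / s.std(axis=0)
    c = (c - c.mean(axis=0)) / c.std(axis=0)
    return s.T @ c / s.shape[0]

def top_k_mean_link(state_presence, semantic_contexts, k=10):
    """Average of the k largest (signed) correlation links."""
    links = pearson_links(state_presence, semantic_contexts).ravel()
    return float(np.sort(links)[-k:].mean())

def select_n_states(candidates, semantic_contexts, k=10):
    """Pick the candidate state count with the highest top-k mean link.

    candidates: dict mapping a state count to the state-presence array
                of a model fitted with that many states (hypothetical input).
    """
    scores = {n: top_k_mean_link(p, semantic_contexts, k)
              for n, p in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores
```

Here `candidates` would hold one state-presence matrix per candidate HMM, all aligned to the same time axis as the semantic contexts.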

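The shuffle-based permutation test described for Supplementary Figure 11 can be illustrated with a short sketch. The function name, the fixed seed, and the two-sided p-value formula are assumptions added for illustration; the caption itself only reports the null histogram and the observed correlation.

```python
import numpy as np

def permutation_test_correlation(x, y, n_permutations=1000, seed=0):
    """Correlation between x and y with a shuffle-based null distribution.

    x, y: 1-D arrays of equal length (e.g., a BOLD time series and a
          semantic label time series).
    Returns (observed r, array of null r values, two-sided p-value).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    observed = np.corrcoef(x, y)[0, 1]
    null = np.empty(n_permutations)
    for i in range(n_permutations):
        # Shuffling one series breaks its temporal alignment with the other.
        null[i] = np.corrcoef(rng.permutation(x), y)[0, 1]
    # Assumed p-value definition: share of shuffles at least as extreme
    # as the observed value, with a +1 correction to avoid p = 0.
    p_value = (1 + np.sum(np.abs(null) >= abs(observed))) / (n_permutations + 1)
    return observed, null, p_value
```

A plain shuffle also removes the autocorrelation of the time series, so this sketch mirrors the procedure as worded in the caption rather than an autocorrelation-preserving alternative.
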
Supplementary Figure 14: Comparison of PLS model contributions between DN&AM and DN&HC models.
(A) Comparison of loading parameters for the DN subregions between the HC and AM groups using 20-fold cross-validation tests. Z-scored loading values from the 20 partial datasets were compared for each subregion using a two-sample two-sided t-test. Significance levels are shown on the y-tick marks, with asterisks indicating p-values less than 0.05, 0.01, and 0.001. (B) The same procedure was applied to the 200 semantic labels generated by NLP. Statistically significant differences were found between the DN&HC and DN&AM groups, with p-values less than 0.001 for each pair of semantic labels (two-sample two-sided t-test). The DN subregion names are from the Schaefer-Yeo 100 atlas (Temp: temporal; Par: parietal; PFC: prefrontal cortex; pCunPCC: precuneus posterior cingulate cortex; PFCv/d/m: ventral/dorsal/medial prefrontal cortex). The left and right hemispheres are denoted as LH and RH. Source data are provided as a Source Data file.
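
The per-subregion comparison in panel (A) can be sketched with SciPy's two-sample t-test. The array layout (folds in rows, subregions in columns), the function names, and the mapping of one, two, and three asterisks to the 0.05, 0.01, and 0.001 thresholds are assumptions; the inputs are taken to be already z-scored.

```python
import numpy as np
from scipy import stats

def significance_stars(p):
    """Assumed convention: * p < 0.05, ** p < 0.01, *** p < 0.001."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""

def compare_loadings(loadings_hc, loadings_am):
    """Two-sample two-sided t-test per subregion.

    loadings_hc, loadings_am: arrays of shape (n_folds, n_subregions)
    holding z-scored loadings from each cross-validation fold.
    Returns (t statistics, p-values, star labels), one entry per subregion.
    """
    t, p = stats.ttest_ind(loadings_hc, loadings_am, axis=0)
    stars = [significance_stars(v) for v in p]
    return t, p, stars
```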

Supplementary Figure 16: Lateralization effects of hippocampal and amygdala subregions. Panel A) displays the names of the hippocampal subregions on the y-axis, and panel B) displays the names of the amygdalar subregions. Each bar represents the mean of the absolute difference in activity across the four brain states, calculated for 15 subjects (4 states for each subject, 60 values in total). The error bars indicate the 95% confidence interval, and significance from a one-sample t-test (testing whether the absolute difference is greater than zero) is denoted by *** (p-value < 0.001) for both subregions. CA: dentate gyrus and Cornu Ammonis; HATA: hippocampal amygdala transition area; HP: hippocampus; ML: molecular layer; DG: granule cell layer of the dentate gyrus (GC-DG-ML). Source data are provided as a Source Data file.
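
The bar height, error bar, and test for a single subregion can be sketched as follows. This is an illustration under assumptions: the function name and the (subjects × states) input layout are hypothetical, the confidence interval is a t-based interval on the mean, and the test is taken to be one-sided because the caption asks whether the difference is greater than zero.

```python
import numpy as np
from scipy import stats

def lateralization_summary(left, right):
    """Summarize |left - right| activity for one subregion.

    left, right: arrays of shape (n_subjects, n_states), e.g. (15, 4),
    giving 60 absolute differences in total.
    Returns (mean, (ci_low, ci_high), p_value, n_values).
    """
    abs_diff = np.abs(np.asarray(left, float) - np.asarray(right, float)).ravel()
    n = abs_diff.size
    mean = abs_diff.mean()
    # t-based 95% confidence interval for the mean (assumed CI type).
    ci = stats.t.interval(0.95, df=n - 1, loc=mean, scale=stats.sem(abs_diff))
    # One-sample t-test against zero; one-sided alternative is an assumption.
    result = stats.ttest_1samp(abs_diff, 0.0, alternative="greater")
    return mean, ci, result.pvalue, n
```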